Abstract

This recreational paper investigates what happens if we change quantum mechanics in several ways. 
The main results are as follows. First, if we replace the 2-norm by some other p-norm, then there are 
no nontrivial norm-preserving linear maps. Second, if we relax the demand that norm be preserved, 
we end up with a theory that allows rapid solution of PP-complete problems (as well as superluminal 
signalling). And third, if we restrict amplitudes to be real, we run into a difficulty much simpler than 
the usual one based on parameter-counting of mixed states. 

1 Introduction 

"It is striking that it has so far not been possible to find a logically consistent theory that is close 
to quantum mechanics, other than quantum mechanics itself." — Steven Weinberg, Dreams of a 
Final Theory [13]

The title of this paper should be self-explanatory, but if it isn't: "theoryspace" is the space of logically 
conceivable physical theories, with two theories close to each other if they differ in few respects. An "island" 
in theoryspace is a natural and interesting theory, whose neighbors are all somehow perverse or degenerate.^1
The Standard Model isn't an island, because we don't know any compelling (non-anthropic) reason why the
masses and coupling constants should have the values they do.^2 Likewise, general relativity is probably not
an island, because of alternatives such as the Brans-Dicke theory. 

To many physicists, however, quantum mechanics does seem like an island: change any one aspect, and 
the whole structure collapses. This view is buttressed by three types of results: 

(1) "Derivations" of the $|\psi|^2$ probability rule. Gleason's Theorem [9] shows that, in a Hilbert
space of dimension 3 or higher, the usual quantum probability rule is the only one consistent with a
requirement of noncontextuality. Deutsch [7] and Zurek [14] derived the rule from other assumptions.

(2) Arguments for complex amplitudes. If $f(n)$ is the number of real parameters needed to specify
an $n$-dimensional mixed state, then only when amplitudes are complex numbers does
$f(n_A n_B) = f(n_A) f(n_B)$ (since $f(n) = n^2$). With real amplitudes, $f(n) = n(n+1)/2$ and thus
$f(n_A n_B) > f(n_A) f(n_B)$. With quaternionic amplitudes, $f(n) = 2n^2 - n$ and thus
$f(n_A n_B) < f(n_A) f(n_B)$.
Caves, Fuchs, and Schack exploited this observation to show that a "quantum de Finetti Theorem"
(which justifies Bayesian reasoning) works only if amplitudes are complex. Hardy [10] also made
essential use of the observation in his derivation of quantum mechanics from "five simple axioms."

(3) "Perverse" consequences of nonlinearity. After Weinberg proposed nonlinear variants of
the Schrödinger equation, Gisin [8] and Polchinski [11] independently observed that almost all such
variants would allow superluminal signalling. Later Abrams and Lloyd [1] argued that a "nonlinear
quantum computer" could solve NP-complete and even #P-complete problems in polynomial time.^3

*University of California, Berkeley. Email: aaronson@cs.berkeley.edu. Supported by an NSF Graduate Fellowship, by NSF
ITR Grant CCR-0121555, and by the Defense Advanced Research Projects Agency (DARPA).

^1A bit of pedantry: a physicist might call the neighbors of quantum mechanics I'll discuss "inconsistent," since they contradict
auxiliary assumptions that the physicist considers obvious. I'll stick to milder epithets like "perverse."

^2More pedantry: whether a theory is an island is therefore a function of our knowledge, not just of the theory itself.
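The parameter counts in item (2) are easy to check mechanically. The following is a minimal Python sketch, not part of the cited arguments; the function names `mixed_state_params` and `compare` are my own, and the counts are for unnormalized mixed states, as in the text.

```python
def mixed_state_params(n, field):
    """Real parameters needed for an n-dimensional (unnormalized) mixed state."""
    if field == "real":
        return n * (n + 1) // 2      # symmetric real n x n matrix
    if field == "complex":
        return n * n                 # Hermitian complex n x n matrix
    if field == "quaternionic":
        return 2 * n * n - n         # quaternionic-Hermitian n x n matrix
    raise ValueError(field)


def compare(n_a, n_b, field):
    """Return '<', '=' or '>' relating f(n_a * n_b) to f(n_a) * f(n_b)."""
    joint = mixed_state_params(n_a * n_b, field)
    product = mixed_state_params(n_a, field) * mixed_state_params(n_b, field)
    if joint < product:
        return "<"
    return ">" if joint > product else "="


for field in ("real", "complex", "quaternionic"):
    print(field, compare(2, 3, field))
```

Only the complex case gives equality, which is the sense in which complex amplitudes make the parameters of a composite system multiply.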

This paper won't attempt another axiomatic derivation like that of Hardy [10]; its more modest goal is
just to stroll through quantum mechanics' neighborhood of theoryspace. All mathematical results in this 
paper are trivial. So why write it then? Apart from the fact that triviality never stopped a quantum 
philosopher before, I hope to make a point: that if you change quantum mechanics in the most obvious 
ways, you'll run into problems that have nothing to do with the subtleties of contextuality, locality, or 
entanglement. Even in "quantum mechanics lite" — where there are no mixed states, no tensor products, 
and no intermediate measurements, just vectors representing probabilities that get mapped to other vectors — 
you'll need to worry about conservation of probability, and about closure properties of the allowed vector 
maps. 

I won't make this point regarding nonlinear quantum mechanics, for the simple reason that there it seems 
false. Contrary to what I originally thought, one can define a large, natural class of discrete norm-preserving 
nonlinear gates. This class includes "Weinberg gates" such as

$$W \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} x \\ e^{i|y|} y \end{pmatrix}$$

as well as "polynomial gates" such as 



G 



y ) \ 2Rexy 



Since $\|G(v)\|_2 = \|v\|_2^2$, the gate $G$ preserves the 2-norm of $v$ provided $\|v\|_2 = 1$. As far as I can tell,
any argument for the implausibility of W or G needs to be based on physical effects, such as superluminal 
signalling or efficient solubility of NP-complete problems. 

The paper is (not very well) organized as follows. Section 2 shows that when $p \neq 2$, the only
$p$-norm-preserving linear transformations are permutations of diagonal matrices. In other words, if you want to base
quantum mechanics on a $p$-norm other than the 2-norm, then you'll need to include some sort of "manual
normalization." However, manual normalization brings with it most of the hazards of nonlinearity:
superluminal signalling, distinguishability of non-orthogonal states, and polynomial-time solubility of "obviously
hard" problems.^4 Section 3 addresses the last point in detail, by using the concept of postselection to study
the computational power of alternative quantum theories. The punchline, which might be of independent
interest to computer scientists, is that all the alternative theories considered have at least the power of the
complexity class PP,^5 and many have exactly the power of PP.

Finally, Section 4 gives an argument for why amplitudes are complex rather than real, that has nothing
to do with the parameter-counting arguments recalled in item (2) above. Unfortunately, my argument says nothing
about why amplitudes are complex rather than quaternionic.



2 Other p-Norms

"Addition in proof: More careful considerations show that the probability is proportional to the 
square of the [amplitude] $„rm-" — Max Born in a footnote to his 1926 paper introducing 
the probability interpretation (the main text says the probability is proportional to $„rm itself) 

No doubt about it: the 2-norm is special. The Pythagorean Theorem, Fermat's Last Theorem, and
least-squares regression all involve properties of a sum of squares that fail for a sum of cubes or of any other
power. Still, given that classical probability theory is based on the 1-norm and quantum mechanics on the
2-norm, it's natural to wonder what singles out 1 and 2. What happens if we try to base a theory on the
$p$-norm^6 for some other $p$? In this section I'll explain why the 2-norm is the only $p$-norm that permits
nontrivial norm-preserving linear maps.^7

^3Abrams and Lloyd claimed furthermore that their nonlinear algorithms are robust against small errors. This claim does
not withstand detailed scrutiny; whether nonlinear quantum computers can solve NP- and #P-complete problems robustly
therefore remains an intriguing open problem. On the other hand, if arbitrary 1-qubit nonlinear gates can be implemented
without error, then even PSPACE-complete problems can be solved in polynomial time. This result is tight, since nonlinear
quantum computers can also be simulated in PSPACE. These claims will be proved in another paper.

^4NP-complete problems are obviously hard; factoring and graph isomorphism are not.

^5See www.cs.berkeley.edu/~aaronson/zoo.html for definitions of over 370 complexity classes.

It's easiest to start with real amplitudes and then generalize to complex ones. We want to know which
matrices $A \in \mathbb{R}^{n \times n}$ have the property that for all vectors $x$, $\|Ax\|_p = \|x\|_p$, where $\|\cdot\|_p$ denotes the $p$-norm.

We can gain some intuition by counting constraints. When $p = 1$ and we restrict our attention to $x$ with
nonnegative entries, we obtain the set of stochastic matrices, or nonnegative matrices that satisfy $n$ linear
constraints. When $p = 2$, we obtain the set of orthogonal matrices, or those $A = (a_{jk})$ such that

$$\sum_{j=1}^{n} a_{jk} a_{jl} = \delta_{kl} \qquad (1)$$

for all $k, l$. Equation (1) imposes $n(n+1)/2$ quadratic constraints on $A$, cutting the number of parameters
needed to specify $A$ roughly in half. Continuing, when $p = 3$ we expect order $n^3$ cubic constraints, when
$p = 4$, order $n^4$ quartic constraints, and so on. That the number of constraints exceeds the number of
parameters for $p > 2$ makes us suspect that $p = 2$ is the "end of the line."
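To make the counting heuristic concrete, here is a small Python sketch. It rests on one assumption of mine: that each monomial of degree $p$ in $x_1, \ldots, x_n$ contributes one polynomial constraint on $A$, so the number of constraints is the number of such monomials, $\binom{n+p-1}{p}$. This reproduces $n(n+1)/2$ at $p = 2$ and grows like $n^p/p!$. The function names are hypothetical.

```python
from math import comb


def num_constraints(n, p):
    """Degree-p monomials in n variables; each yields one constraint on A."""
    return comb(n + p - 1, p)


def num_parameters(n):
    """Free real entries of an n x n matrix."""
    return n * n


n = 10
for p in (2, 3, 4, 6):
    print(p, num_constraints(n, p), num_parameters(n))
```

For small $n$ the comparison can be tight (for $n = 2$, $p = 3$ the two counts coincide), which is one more reason the counting argument is only a heuristic.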

But that's not a rigorous argument, because we know there are matrices that are norm-preserving for all
$p$: the generalized diagonal matrices (that is, products of permutation matrices and diagonal matrices). To
show that these are the only norm-preserving matrices, first let $p$ be an even integer greater than 2. Then
letting $x = (x_j)$, the requirement

$$\sum_{j=1}^{n} x_j^p = \sum_{j=1}^{n} \left( \sum_{k=1}^{n} a_{jk} x_k \right)^p \qquad (2)$$

for all $x$ implies that the left- and right-hand sides are identical as formal polynomials, and therefore (among
other constraints) that

$$\sum_{j=1}^{n} a_{jk}^{p-2} a_{jl}^{2} = \delta_{kl}$$

for all $k, l$. This in turn implies that for all $j$ and $k \neq l$, either $a_{jk} = 0$ or $a_{jl} = 0$. But since every column
must contain nonzero entries by the constraint $\sum_j a_{jk}^p = 1$, this implies that $A$ is a generalized diagonal
matrix.
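A numerical illustration of the even-$p$ case, as a sketch only: the rotation angle, the test vector, and the helper names below are arbitrary choices of mine. A rotation preserves the 2-norm but not the 4-norm, while a signed permutation (a generalized diagonal matrix) preserves both.

```python
import math


def p_norm(v, p):
    """The p-norm (sum_j |v_j|^p)^(1/p)."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)


def apply(A, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * c for a, c in zip(row, v)) for row in A]


theta = math.pi / 5
rotation = [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]
signed_permutation = [[0.0, -1.0],
                      [1.0, 0.0]]          # generalized diagonal

x = [0.6, 0.8]
for p in (2, 4):
    print(p, p_norm(x, p),
          p_norm(apply(rotation, x), p),
          p_norm(apply(signed_permutation, x), p))
```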

Next let $p$ be an odd positive integer. We claim that, so long as $x_1, \ldots, x_n$ are nonnegative, the entries
of $Ax$ never change sign. Clearly there exist $s_1, \ldots, s_n \in \{-1, 1\}$ such that

$$\sum_{j=1}^{n} x_j^p = \sum_{j=1}^{n} s_j y_j^p$$

as formal polynomials, where $y_j = \sum_{k=1}^{n} a_{jk} x_k$. Suppose by contradiction that, keeping all $x_j$'s nonnegative,
we could make $\mathrm{sgn}(y_j) s_j = -1$ for some $j$, where $\mathrm{sgn}(y_j)$ is 0 if $y_j = 0$ and $y_j / |y_j|$ otherwise. Then

$$\sum_{j=1}^{n} \mathrm{sgn}(y_j)\, y_j^p = \sum_{j=1}^{n} x_j^p = \sum_{j=1}^{n} s_j y_j^p$$

as formal polynomials, which implies that

$$\sum_{j \,:\, \mathrm{sgn}(y_j) s_j = -1} \mathrm{sgn}(y_j)\, y_j^p = 0.$$

Since every term in the above sum is nonnegative, we have $y_j = 0$ for all $j$ such that $\mathrm{sgn}(y_j) s_j = -1$, which
implies that $a_{jk} = 0$ for all $j, k$ such that $\mathrm{sgn}(y_j) s_j = -1$, contradiction.

^6The main reason for restricting attention to $p$-norms is their behavior under tensor products: disregarding zany functions
that depend on the Axiom of Choice, if $f(\alpha\beta) = f(\alpha) f(\beta)$ for all $\alpha, \beta$ then $f(\alpha)$ must have the form $|\alpha|^p$. However, it might
be interesting to consider theories where the probability of measuring a basis state $|x\rangle$ depends on all amplitudes, not just that
of $|x\rangle$.

^7When $p = 0$ all linear maps are norm-preserving, but they have no effect because all outcomes of a measurement are always
equiprobable. When $p = \infty$ only generalized diagonal matrices are norm-preserving, as in the case $2 < p < \infty$. I refuse even
to discuss the case $p < 0$.

Since the entries of $Ax$ never change sign when $x$ is nonnegative, it follows that in each row of $A$, all
entries have the same sign. So if we define a new matrix $B$ by $b_{jk} = |a_{jk}|$, then $B$ also has the property
that $\|Bx\|_p = \|x\|_p$ for all $x$. But then when $p \geq 3$, the same reasoning from the case of even $p$ implies
that $B$ is generalized diagonal, which implies that $A$ was generalized diagonal as well. When $p = 1$, $B$ is
stochastic, and it is easily checked that the only stochastic matrices that preserve the 1-norm of all vectors
(not just nonnegative ones) are permutation matrices.
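The "easily checked" claim for $p = 1$ can be seen on a small example. This is a sketch with a hypothetical $2 \times 2$ column-stochastic matrix of my choosing: it preserves the 1-norm of nonnegative vectors but shrinks a vector with mixed signs, whereas a permutation matrix preserves both.

```python
def one_norm(v):
    """The 1-norm sum_j |v_j|."""
    return sum(abs(c) for c in v)


def apply(A, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(a * c for a, c in zip(row, v)) for row in A]


stochastic = [[0.5, 0.25],
              [0.5, 0.75]]       # nonnegative, each column sums to 1
permutation = [[0.0, 1.0],
               [1.0, 0.0]]

nonnegative = [0.3, 0.7]
mixed_sign = [1.0, -1.0]

for name, A in (("stochastic", stochastic), ("permutation", permutation)):
    print(name,
          one_norm(apply(A, nonnegative)), one_norm(nonnegative),
          one_norm(apply(A, mixed_sign)), one_norm(mixed_sign))
```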

Finally, let $p > 0$ be an arbitrary real that is not an integer. Let $X_j = |x_j|^p$; then

$$\sum_{j=1}^{n} X_j = \sum_{j=1}^{n} \left| \sum_{k=1}^{n} a_{jk} X_k^{1/p} \right|^p$$

for all nonnegative $x_1, \ldots, x_n$. It follows that there exist $s_1, \ldots, s_n \in \{-1, 1\}$ such that

$$\sum_{j=1}^{n} X_j = \sum_{j=1}^{n} s_j \left( \sum_{k=1}^{n} a_{jk} X_k^{1/p} \right)^p$$

as formal functions. But this implies that $A$ is generalized diagonal, since otherwise the right-hand side
could never be simplified to a linear function in the $X_j$'s.

So much for real amplitudes. When we generalize to complex amplitudes $x_j \in \mathbb{C}$, there are two defensible
choices: letting $x_j = \alpha_j + i\beta_j$, we could require either $\sum_{j=1}^{n} (|\alpha_j|^p + |\beta_j|^p) = 1$ or $\sum_{j=1}^{n} |x_j|^p = 1$, where
$|x_j| = \sqrt{\alpha_j^2 + \beta_j^2}$ as usual. Under the first choice, we can consider $\alpha_1, \ldots, \alpha_n, \beta_1, \ldots, \beta_n$ as a vector of $2n$
reals and $A$ as a $2n \times 2n$ matrix; then the results from the real-amplitude case immediately imply that $A$ is
generalized diagonal. Under the second choice, we can choose an $x_l \neq 0$ and replace it by $e^{i\theta} x_l$, holding all
other $x_k$'s fixed. Then since $|x_l|^p$ remains constant as we vary $\theta$,

$$\sum_{j=1}^{n} \left| a_{jl} e^{i\theta} x_l + \sum_{k \neq l} a_{jk} x_k \right|^p$$

must also remain constant. But when $p \neq 2$, this is possible only if for all $j$, either $a_{jl} = 0$ or $\sum_{k \neq l} a_{jk} x_k = 0$.
Intuitively, once we sneak the 2-norm in "through the back door" in defining the norm of a complex number,
consistency forces us to use it everywhere. We omit a proof of this fact, since it follows easily from a case
analysis similar to that for real amplitudes.
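The phase-rotation argument can be watched in action. In the sketch below (the Hadamard matrix, the test vector, and the function names are my own choices for illustration), the phase of one input amplitude is varied while $\sum_j |x_j|^p$ stays fixed. The output sum $\sum_j |(Ax)_j|^p$ stays constant for $p = 2$ but not for $p = 4$.

```python
import cmath
import math


def output_p_sum(A, x, p):
    """sum_j |(Ax)_j|^p for a matrix A (list of rows) and complex vector x."""
    return sum(abs(sum(a * c for a, c in zip(row, x))) ** p for row in A)


h = 1 / math.sqrt(2)
hadamard = [[h, h],
            [h, -h]]             # unitary, but not generalized diagonal


def spread(p, steps=16):
    """Max minus min of the output p-sum as the phase theta of x_1 is varied."""
    values = []
    for t in range(steps):
        theta = 2 * math.pi * t / steps
        x = [cmath.exp(1j * theta) * h, h]
        values.append(output_p_sum(hadamard, x, p))
    return max(values) - min(values)


print(spread(2), spread(4))
```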

Stepping back, what can we say about why the 2-norm is special? The standard answer (that the
2-norm is special because it's preserved under rotations) merely pushes the question from quantum mechanics
back to the Pythagorean Theorem. The latter might be thought a good enough place to stop. However,
although the Pythagorean Theorem dates back some 3800 years, I confess to having never understood it at
a gut level. (Have you?) So if pressed, I'd instead answer the question as follows: values of $p$ other than
positive even integers are almost nonstarters, since we want $|x|^p$ to be defined and smooth at $x = 0$. But
when $p = 4, 6, 8, \ldots$, Equation (2) involves terms of the form $(a_{jk} x_k)^q (a_{jl} x_l)^{p-q}$ where $q$ and $p - q$ are both
positive even integers, and that immediately forces $A$ to be generalized diagonal. So all that's left is $p = 2$.

If you still want to define quantum mechanics using a $p$-norm where $p \neq 2$, the only option seems to be
manual normalization. This means that when a state $|\psi\rangle = \sum_x \alpha_x |x\rangle$ is measured in the standard basis,
the probability of outcome $|x\rangle$ is $|\alpha_x|^p / \sum_y |\alpha_y|^p$. Since keeping $|\psi\rangle$ normalized is no longer imperative,
three options present themselves for how $|\psi\rangle$ evolves:

(i) As usual, $|\psi\rangle$ can be mapped to $U|\psi\rangle$ where $U$ is any unitary matrix.

(ii) $|\psi\rangle$ can be mapped to $A|\psi\rangle$ where $A$ is any invertible matrix.

(iii) $|\psi\rangle$ can be mapped to $A|\psi\rangle$, but then local normalization is performed on the subsystem acted on by
$A$.
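The manually normalized measurement rule is a one-liner. A minimal sketch, with a hypothetical function name and a deliberately unnormalized example state of my choosing:

```python
def outcome_probabilities(amplitudes, p):
    """Probability of each basis outcome: |a_x|^p / sum_y |a_y|^p."""
    weights = [abs(a) ** p for a in amplitudes]
    total = sum(weights)
    if total == 0:
        raise ValueError("the zero state has no outcome distribution")
    return [w / total for w in weights]


state = [3, 4j]                  # unnormalized on purpose
for p in (1, 2, 4):
    print(p, outcome_probabilities(state, p))
```

Because the rule divides by the total weight, rescaling the state never changes the distribution. This is why evolution under options (ii) and (iii) need not preserve any norm.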
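To illustrate option (iii), suppose the nonunitary gate

$$\begin{pmatrix} q & r \\ s & t \end{pmatrix}$$

is applied to the second qubit of the normalized state $\alpha|00\rangle + \beta|01\rangle + \gamma|10\rangle + \delta|11\rangle$. Then the unnormalized
result is

$$(q\alpha + r\beta)|00\rangle + (s\alpha + t\beta)|01\rangle + (q\gamma + r\delta)|10\rangle + (s\gamma + t\delta)|11\rangle,$$

so the locally normalized result is

$$\frac{\sqrt{\alpha^2 + \beta^2}\,\left[(q\alpha + r\beta)|00\rangle + (s\alpha + t\beta)|01\rangle\right]}{\sqrt{(q\alpha + r\beta)^2 + (s\alpha + t\beta)^2}} + \frac{\sqrt{\gamma^2 + \delta^2}\,\left[(q\gamma + r\delta)|10\rangle + (s\gamma + t\delta)|11\rangle\right]}{\sqrt{(q\gamma + r\delta)^2 + (s\gamma + t\delta)^2}}$$

in contrast to the globally normalized result of

$$\frac{(q\alpha + r\beta)|00\rangle + (s\alpha + t\beta)|01\rangle + (q\gamma + r\delta)|10\rangle + (s\gamma + t\delta)|11\rangle}{\sqrt{(q\alpha + r\beta)^2 + (s\alpha + t\beta)^2 + (q\gamma + r\delta)^2 + (s\gamma + t\delta)^2}}.$$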
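The two normalization prescriptions can be compared numerically. The sketch below assumes real amplitudes and the 2-norm, as in the formulas above; the gate, the input state, and all function names are hypothetical choices of mine. Local normalization leaves the first qubit's outcome probabilities untouched, whereas global normalization lets a gate on the second qubit shift them.

```python
import math


def apply_to_second_qubit(gate, state):
    """Apply [[q, r], [s, t]] to qubit 2 of [a00, a01, a10, a11], unnormalized."""
    (q, r), (s, t) = gate
    a, b, c, d = state
    return [q * a + r * b, s * a + t * b, q * c + r * d, s * c + t * d]


def norm(v):
    return math.sqrt(sum(x * x for x in v))


def globally_normalized(gate, state):
    out = apply_to_second_qubit(gate, state)
    total = norm(out)
    return [x / total for x in out]


def locally_normalized(gate, state):
    out = apply_to_second_qubit(gate, state)
    result = []
    for k in (0, 2):             # block with first qubit 0, then first qubit 1
        before, after = norm(state[k:k + 2]), norm(out[k:k + 2])
        scale = before / after if after > 0 else 0.0
        result += [x * scale for x in out[k:k + 2]]
    return result


def first_qubit_zero_probability(state):
    return state[0] ** 2 + state[1] ** 2


gate = [[1.0, 0.0], [0.0, 0.1]]  # invertible but nonunitary
state = [0.6, 0.0, 0.0, 0.8]     # 0.6|00> + 0.8|11>
print(first_qubit_zero_probability(locally_normalized(gate, state)))
print(first_qubit_zero_probability(globally_normalized(gate, state)))
```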

So, what's wrong with these prescriptions? Nothing, as long as you can stomach the following: 

(1) Distinguishability of non-orthogonal states. Here's how to distinguish $d = \Omega(\sqrt{p})$ states of a
single qubit with constant probability of error, under option (i) (and therefore under (ii) and (iii) as well).
Let the $j$-th state be $|\psi_j\rangle = \cos(\pi j/d)|0\rangle + \sin(\pi j/d)|1\rangle$ where $j \in \{0, \ldots, d-1\}$. Apply a $d \times d$ unitary
matrix to $|\psi_j\rangle$ whose first two columns are

$$\begin{pmatrix} \cos(\pi 0/d)/\sqrt{d} \\ \vdots \\ \cos(\pi (d-1)/d)/\sqrt{d} \end{pmatrix}, \qquad \begin{pmatrix} \sin(\pi 0/d)/\sqrt{d} \\ \vdots \\ \sin(\pi (d-1)/d)/\sqrt{d} \end{pmatrix}.$$

Then measure in the standard basis. Suppose without loss of generality that $j = 0$ and that $d$ is odd; then
the probability of any outcome other than 0 being measured is $q/(q+1)$ where

$$\begin{aligned}
q &= 2 \sum_{k=1}^{(d-1)/2} \cos^p \frac{\pi k}{d} \\
  &< 2 \sum_{k=1}^{(d-1)/2} \left( 1 - \frac{(\pi k/d)^2}{2} + \frac{(\pi k/d)^4}{24} \right)^p \\
  &< 2 \sum_{k=1}^{(d-1)/2} \left( 1 - \frac{\pi^2 k^2}{4d^2} \right)^p \\
  &< 2 \sum_{k=1}^{(d-1)/2} \exp\left( -\frac{\pi^2 k^2 p}{4d^2} \right),
\end{aligned}$$

so that $q/(q+1)$ is bounded away from 1 so long as $p > cd^2$ for some constant $c$. It would be interesting to obtain
bounds on how many states can be reliably distinguished in higher-dimensional Hilbert spaces.

(2) Superluminal signalling. Under option (ii), given an EPR pair, Alice can communicate a bit to
Bob by mapping $|00\rangle + |11\rangle$ to either $|00\rangle + \varepsilon|11\rangle$ or $\varepsilon|00\rangle + |11\rangle$. Indeed, using the ideas from part (1), she
can communicate $\Omega(\sqrt{p})$ bits to Bob using a single EPR pair! I conjecture this is tight. Under options
(i) and (iii), Alice can communicate a bit to Bob given enough EPR pairs, by taking advantage of Bob's
ability to distinguish nonorthogonal states. Note that under options (ii) and (iii), superluminal signalling
is possible even when $p = 2$.
(3) Efficient solubility of NP-complete and even harder problems. Suppose you're given a
Boolean function $f : \{0,1\}^n \to \{0,1\}$. Under option (ii), first prepare $\sum_x |x\rangle |f(x)\rangle$, then apply the
nonunitary gate

$$\begin{pmatrix} 2^{-2n} & 0 \\ 0 & 1 \end{pmatrix}$$

to the $f$ register and measure to learn whether there exists an $x$ such that $f(x) = 1$. Indeed, Section 3
shows that under options (i), (ii), and (iii), you could solve even PP-complete problems in polynomial time,
which are believed to be harder than NP-complete problems.

(4) Singularity. Under options (ii) and (iii), the matrix A could be arbitrarily close to a non-invertible 
matrix, which can map nonzero states to the zero state. 
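The error probability $q/(q+1)$ from item (1) is easy to tabulate. This is a sketch under the assumptions of that item ($j = 0$, odd $d$); the function name is my own. Note that the overall scaling of the unitary's columns cancels in $q/(q+1)$, since only ratios of $p$-th powers of amplitudes matter.

```python
import math


def error_probability(d, p):
    """q/(q+1), where q = 2 * sum_{k=1}^{(d-1)/2} cos^p(pi*k/d), for odd d."""
    if d % 2 == 0:
        raise ValueError("the formula assumes odd d")
    q = 2 * sum(math.cos(math.pi * k / d) ** p
                for k in range(1, (d - 1) // 2 + 1))
    return q / (q + 1)


for d, p in ((5, 2), (5, 50), (21, 2), (21, 4 * 21 ** 2)):
    print(d, p, error_probability(d, p))
```

At $p = 2$ the error approaches 1 as $d$ grows, as it must in ordinary quantum mechanics; once $p$ is a sufficiently large multiple of $d^2$, the error becomes small.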



3 Quantum Computing With Postselection 

This section can be skipped by physicists with no interest in computational complexity.^8 Its goal is to show
that, if you change quantum mechanics in any of three ways, then the class of problems efficiently solvable
on a quantum computer expands drastically, from BQP (Bounded-Error Quantum Polynomial-Time) to PP
(Probabilistic Polynomial-Time). Here PP is a well-studied classical complexity class, consisting of all
decision problems for which there exists a probabilistic polynomial-time Turing machine that accepts with
probability at least 1/2 if the answer is "yes," and with probability less than 1/2 if the answer is "no."
The three changes that would give quantum computers the power of PP are: replacing the 2-norm by the
$p$-norm for any $p \neq 2$,^9 allowing arbitrary invertible matrices instead of just unitary matrices, or allowing
postselection on measurement outcomes. Any combination of these changes would also yield PP. Note,
however, that I always assume global normalization (corresponding to options (i) and (ii) in Section 2).
It will be convenient to define a new complexity class:

Definition 1 PostBQP (or BQP with postselection) is the class of languages $L$ for which there exists a
uniform family of polynomial-size quantum circuits such that for all inputs $x$,

(i) At the end of the computation, the first qubit has a nonzero probability of being measured to be $|1\rangle$.

(ii) If $x \in L$, then conditioned on the first qubit being $|1\rangle$, the second qubit is $|1\rangle$ with probability at least
2/3.

(iii) If $x \notin L$, then conditioned on the first qubit being $|1\rangle$, the second qubit is $|1\rangle$ with probability at most
1/3.

Intuitively, postselection means that at some point in the computation, you can measure a qubit that
has a nonzero probability of being $|1\rangle$, and assume that the outcome will be $|1\rangle$ (or equivalently, discard all
runs where the outcome is $|0\rangle$). Just as Bernstein and Vazirani [3] showed that intermediate measurements
don't increase the power of ordinary quantum computers, so it's easy to show that intermediate postselection
steps don't increase the power of PostBQP (since these steps can all be deferred to the end). On the other
hand, if operations can be performed conditioned on measurement outcomes, then mixing postselection and
measurement steps could increase the power of PostBQP.
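In terms of amplitudes, postselection is just conditioning. A minimal sketch for the two-qubit situation of Definition 1 follows; the dictionary representation and the function name are my own conventions, and the example amplitudes are made up.

```python
def postselected_probability(amplitudes):
    """Pr[second qubit = 1 | first qubit = 1] under the usual 2-norm rule.

    `amplitudes` maps (first_bit, second_bit) to a complex amplitude.
    """
    kept = {bits: abs(a) ** 2 for bits, a in amplitudes.items() if bits[0] == 1}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("the postselected event has probability zero")
    return kept.get((1, 1), 0.0) / total


# The first qubit is 1 only with probability 0.001, yet after
# postselection the second qubit is 1 with probability 0.9.
example = {(0, 0): 0.999 ** 0.5, (1, 0): 0.01, (1, 1): 0.03}
print(postselected_probability(example))
```

The point of condition (i) in Definition 1 is visible in the error branch: the conditional probability is undefined if the first qubit is never $|1\rangle$, but it is well defined however small that probability is.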

In the remainder of the section, I'll first show that PostBQP = PP (Theorem 2), and then use that result
to prove that the other changes also give quantum computers the power of PP.

Theorem 2 PostBQP = PP. 

Proof. The inclusion PostBQP $\subseteq$ PP follows easily from the techniques used by Adleman, DeMarrais, and
Huang [2] to show that BQP $\subseteq$ PP.

For the other direction, let $f : \{0,1\}^n \to \{0,1\}$ be a Boolean function and let $s = |\{x : f(x) = 1\}|$. Then
we need to decide in PostBQP whether $s < 2^{n-1}$ or $s > 2^{n-1}$. (As a technicality, we can guarantee using
padding that $s > 0$ and $s \neq 2^{n-1}$.) The algorithm is as follows: first prepare $2^{-n/2} \sum_{x \in \{0,1\}^n} |x\rangle |f(x)\rangle$.
Then following Abrams and Lloyd [1], apply Hadamard gates to all $n$ qubits in the first register and
postselect^10 on that register being $|0\rangle^{\otimes n}$, to obtain $|0\rangle^{\otimes n} |\psi_s\rangle$ where

$$|\psi_s\rangle = \frac{(2^n - s)|0\rangle + s|1\rangle}{\sqrt{(2^n - s)^2 + s^2}}.$$

Next, for some positive real $\alpha, \beta$ to be specified later, prepare $\alpha|0\rangle|\psi_s\rangle + \beta|1\rangle|\phi_s\rangle$ where

$$|\phi_s\rangle = \frac{2^n|0\rangle + (2^n - 2s)|1\rangle}{\sqrt{2(2^n - s)^2 + 2s^2}}$$

is the result of applying a Hadamard to $|\psi_s\rangle$. Postselecting on the second qubit being $|1\rangle$ then yields the state

$$|\varphi_{s,\beta/\alpha}\rangle = \frac{\alpha s|0\rangle + \sqrt{1/2}\,\beta (2^n - 2s)|1\rangle}{\sqrt{\alpha^2 s^2 + \beta^2 (2^n - 2s)^2 / 2}}$$

in the first qubit. A simple calculation now reveals that if $s < 2^{n-1}$, then there exists an integer $i$ in the
range $[-n, n]$ such that

$$|\langle + | \varphi_{s,2^i} \rangle| \geq \frac{1 + \sqrt{2}}{\sqrt{6}} > 0.985$$

where $|+\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$. If $s > 2^{n-1}$, on the other hand, then for all such $i$ we have $|\langle + | \varphi_{s,2^i} \rangle| < 1/\sqrt{2}$.
So by running the whole algorithm $n(2n+1)$ times in parallel, with $n$ invocations for each integer $i \in [-n, n]$,
we can learn whether $s < 2^{n-1}$ or $s > 2^{n-1}$ with exponentially small probability of error. ■

^8The rest of the paper can be skipped by computational complexity theorists with no interest in physics.

^9If $p$ is not a positive even integer, then the power increases at least to PP and possibly further.
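The "simple calculation" at the end of the proof can be checked by brute force for small $n$. The sketch below evaluates $|\langle + | \varphi_{s,2^i} \rangle|$ directly from the postselected state with $\alpha = 1$ and $\beta = 2^i$; the function names and the decision threshold 0.9 are my own choices, and the code computes the ideal amplitudes rather than simulating a circuit.

```python
import math


def overlap_with_plus(n, s, i):
    """|<+|phi_{s,2^i}>| for the postselected state, with alpha=1, beta=2^i."""
    a = float(s)                                          # coefficient of |0>
    b = (2.0 ** i) * (2 ** n - 2 * s) / math.sqrt(2)      # coefficient of |1>
    return abs(a + b) / math.sqrt(2 * (a * a + b * b))


def best_overlap(n, s):
    """Largest overlap over all integers i in [-n, n]."""
    return max(overlap_with_plus(n, s, i) for i in range(-n, n + 1))


def fewer_than_half(n, s, threshold=0.9):
    """Decide whether s < 2^(n-1), assuming 0 < s and s != 2^(n-1)."""
    return best_overlap(n, s) > threshold


n = 6
print(min(best_overlap(n, s) for s in range(1, 2 ** (n - 1))))
print(max(best_overlap(n, s) for s in range(2 ** (n - 1) + 1, 2 ** n)))
```

When $s > 2^{n-1}$ the two coefficients have opposite signs, so the state never enters the quadrant containing $|+\rangle$; that is the whole reason the two cases separate.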

Let $\mathrm{BQP}_{\text{nu-global}}$ be the class of problems solvable by a uniform family of polynomial-size, bounded-error
quantum circuits, if the circuits can consist of arbitrary invertible gates, not just unitary gates. Option (ii)
from Section 2 is used for normalization; that is, before a measurement, the amplitude of each basis state
$|x\rangle$ is divided by $\sqrt{\sum_y |\alpha_y|^2}$.

Proposition 3 $\mathrm{BQP}_{\text{nu-global}}$ = PP.

Proof. The inclusion $\mathrm{BQP}_{\text{nu-global}} \subseteq$ PP follows easily from Ref. [2]. For the other direction, by Theorem
2 it suffices to observe that PostBQP $\subseteq \mathrm{BQP}_{\text{nu-global}}$. To postselect on a qubit being $|1\rangle$, simply apply the
nonunitary gate $G$ from Section 2. ■
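Here is a sketch of how a nonunitary gate plus global normalization imitates postselection. It assumes a gate of the form $\mathrm{diag}(\varepsilon, 1)$ with tiny $\varepsilon$, in the spirit of the gate $\mathrm{diag}(2^{-2n}, 1)$ from Section 2; the dictionary representation, function names, and example state are hypothetical.

```python
def apply_suppression_gate(state, qubit, epsilon):
    """Apply diag(epsilon, 1) to one qubit of `state` (bitstring -> amplitude)."""
    return {bits: (amp if bits[qubit] == "1" else epsilon * amp)
            for bits, amp in state.items()}


def probabilities(state):
    """Globally normalized 2-norm measurement rule (option (ii))."""
    total = sum(abs(amp) ** 2 for amp in state.values())
    return {bits: abs(amp) ** 2 / total for bits, amp in state.items()}


state = {"00": 0.9, "01": 0.4, "10": 0.1, "11": 0.05}   # need not be normalized
after = probabilities(apply_suppression_gate(state, qubit=0, epsilon=1e-9))
first_is_one = after["10"] + after["11"]
print(first_is_one, after["11"] / first_is_one)
```

After the gate, nearly all the weight sits on the branch where the chosen qubit is 1, and the relative weights within that branch are unchanged. That is exactly what conditioning on the outcome 1 would give.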

Define $\mathrm{BQP}_{\text{nu-local}}$ similarly to $\mathrm{BQP}_{\text{nu-global}}$, except that after every gate $G$, option (iii) (local normalization)
is applied to the qubits acted on by $G$. Assume that arbitrary 1- and 2-qubit gates are available to
polynomially many bits of precision.

Proposition 4 PP $\subseteq \mathrm{BQP}_{\text{nu-local}} \subseteq$ PSPACE.

Proof. For PP $\subseteq \mathrm{BQP}_{\text{nu-local}}$, observe that in the proof of Theorem 2 the only essential postselection steps
are applied to 2-qubit pure states unentangled with anything else. For 2-qubit gates acting on these states,
local normalization is the same as global normalization.

For $\mathrm{BQP}_{\text{nu-local}} \subseteq$ PSPACE, let $\alpha_x^{(t)}$ be the amplitude of basis state $|x\rangle$ at time $t$. Then for all $x, t$
we can write $\alpha_x^{(t)}$ as a function of $\alpha_{y_1}^{(t-1)}, \ldots, \alpha_{y_k}^{(t-1)}$ for some constant $k$ and basis states $|y_1\rangle, \ldots, |y_k\rangle$.
This immediately implies a depth-first recursive algorithm (using a polynomial amount of memory) for
approximating any amplitude to polynomially many bits of precision. ■

Finally, for any nonnegative real number $p$, define $\mathrm{BQP}_p$ similarly to BQP, except that the probability of
measuring a basis state $|x\rangle$ equals $|\alpha_x|^p / \sum_y |\alpha_y|^p$. (Thus $\mathrm{BQP}_2$ = BQP.) All gates are unitary.

Proposition 5 PP $\subseteq \mathrm{BQP}_p \subseteq \mathrm{P}^{\#\mathrm{P}}$ for all constants $p \neq 2$, and $\mathrm{BQP}_p$ = PP provided $p$ is an even integer
greater than 2.

^10Actually postselection is overkill here, since the first register has at least 1/4 probability of being $|0\rangle^{\otimes n}$.
Proof. The inclusion $\mathrm{BQP}_p \subseteq \mathrm{P}^{\#\mathrm{P}}$ is obvious. To simulate $\mathrm{BQP}_p$ in PP when $p$ is a positive even integer,
use the techniques of Ref. [2] (which handle the $p = 2$ case), but evaluate polynomials of degree $p$ instead
of quadratic polynomials. To simulate PP in $\mathrm{BQP}_p$ when $p \neq 2$, run the algorithm of Theorem 2, having
initialized $O(n^2 / |p - 2|)$ ancilla qubits to $|0\rangle$. To postselect on the $b$-th qubit being $|1\rangle$: if $p < 2$, then apply
Hadamards to $10pn/(2 - p)$ ancilla qubits conditioned on the $b$-th qubit being $|1\rangle$. If $p > 2$, then apply
Hadamards to $10pn/(p - 2)$ ancilla qubits conditioned on the $b$-th qubit being $|0\rangle$. ■
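The ancilla trick works because Hadamards change a branch's $p$-norm weight whenever $p \neq 2$: spreading an amplitude $a$ over $2^m$ equal branches turns the weight $|a|^p$ into $2^m |a / 2^{m/2}|^p = |a|^p \, 2^{m(1 - p/2)}$. A minimal sketch of that bookkeeping follows; the function names are my own, and the code tracks weights only rather than simulating a circuit.

```python
def weight_after_hadamards(amplitude, m, p):
    """Total of |amp|^p over the 2^m branches made by Hadamards on m ancillas."""
    branch_amplitude = abs(amplitude) / 2 ** (m / 2)
    return (2 ** m) * branch_amplitude ** p


def probability_of_one(amp0, amp1, m, p):
    """p-norm probability that the qubit reads 1 after the conditional Hadamards.

    For p < 2 the |1> branch is spread over the ancillas (its weight grows);
    for p > 2 the |0> branch is spread instead (its weight shrinks).
    """
    w0, w1 = abs(amp0) ** p, abs(amp1) ** p
    if p < 2:
        w1 = weight_after_hadamards(amp1, m, p)
    elif p > 2:
        w0 = weight_after_hadamards(amp0, m, p)
    return w1 / (w0 + w1)


print(probability_of_one(1.0, 0.01, 60, 1))
print(probability_of_one(1.0, 0.1, 40, 4))
print(probability_of_one(1.0, 0.1, 40, 2))
```

At $p = 2$ the weight is unchanged, which is why this route to postselection is closed in ordinary quantum mechanics.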

4 Real Amplitudes 

"C'mon, they're algebraically closed!" — A math graduate student, when asked why God would 
resort to complex numbers in creating quantum mechanics 

To a beginner, perhaps the most unexpected fact about quantum mechanics is that amplitudes are 
complex. As the term 'imaginary' suggests, we tend to think of complex numbers as (useful) human 
inventions; it's unsettling if the source code of the Universe is best written in a language like Fortran with 
a complex-number data type. Also, in contrast to what we saw in Section 2, restricting amplitudes to
be real doesn't lead to a theory obviously very different from quantum mechanics. All the greatest hits
are still there: interference, entanglement, Bell inequality violations, noncommuting observables, non-unique
decompositions of mixed states, universal quantum computing, the Zeno effect, the Gleason and Kochen- 
Specker theorems. 

Nevertheless, Section 1 recalled a subtle difference between complex and real (or for that matter complex
and quaternionic) amplitudes, based on counting the number of parameters of a mixed state. This section 
gives a completely different argument for why amplitudes aren't real. The advantage of this argument is 
that it's elementary and intuitive; the disadvantage is that it says nothing about why amplitudes are complex 
rather than quaternionic. 

Let $\mathcal{S}$ be a set of states, and let $\mathcal{U}$ be a set of transformations from $\mathcal{S}$ to itself. Say $\mathcal{U}$ has the square
root property if for all $U \in \mathcal{U}$, there exists another transformation $V \in \mathcal{U}$ such that $V(V(S)) = U(S)$ for
all $S \in \mathcal{S}$. If time is continuous, then the importance of the square root property is obvious: without it
there are transformations that can't be interpreted as the result of applying a fixed Hamiltonian for some
interval of time. Even if time is discrete, the square root property is desirable, because it allows any $U$ that
acts over $k$ time steps to be approximated by $V^k$ for some $V$ that acts over a single time step.^11 Clearly
quantum mechanics has the square root property: given a unitary $U$, let $|\psi_1\rangle, \ldots, |\psi_n\rangle$ be the eigenvectors
of $U$ and let $\lambda_1, \ldots, \lambda_n$ be the corresponding eigenvalues; then there exists a unitary $V$ with eigenvectors
$|\psi_1\rangle, \ldots, |\psi_n\rangle$ and eigenvalues $\mu_1, \ldots, \mu_n$ such that $\mu_j^2 = \lambda_j$, which therefore satisfies $V^2 = U$. Since every
quaternion has a square root,^12 the same argument shows that quaternionic quantum mechanics has the
square root property.
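For concreteness, here are explicit unitary square roots of two 1-qubit gates, checked numerically. This is only a sketch: the choice of the phase flip and the bit flip as examples is mine, and the roots are written down by hand rather than computed from an eigendecomposition.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def dagger(A):
    """Conjugate transpose."""
    n = len(A)
    return [[complex(A[j][i]).conjugate() for j in range(n)] for i in range(n)]


def close(A, B, tol=1e-12):
    return all(abs(a - b) < tol for ra, rb in zip(A, B) for a, b in zip(ra, rb))


identity = [[1, 0], [0, 1]]
phase_flip = [[1, 0], [0, -1]]
bit_flip = [[0, 1], [1, 0]]

sqrt_phase_flip = [[1, 0], [0, 1j]]                  # eigenvalues 1 and i
sqrt_bit_flip = [[(1 + 1j) / 2, (1 - 1j) / 2],
                 [(1 - 1j) / 2, (1 + 1j) / 2]]

for U, V in ((phase_flip, sqrt_phase_flip), (bit_flip, sqrt_bit_flip)):
    print(close(matmul(V, V), U), close(matmul(V, dagger(V)), identity))
```

Both roots need the imaginary unit; both gates have determinant $-1$, which no square of a real matrix can have.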

However, real quantum mechanics doesn't have the square root property. This is immediate since 
orthogonal matrices such as 




with determinant $-1$ can't be written as squares of matrices with real determinants. If we want to restore
the square root property, then we have two choices. The first choice is to restrict to the group $SO(n)$, that
is, to real orthogonal matrices with determinant 1. It's not hard to see that for every $U \in SO(n)$, there
exists a $V \in SO(n)$ such that $V^2 = U$. On the other hand, natural 1-qubit operations such as the above
two can only be implemented by using ancilla qubits. The second choice is to allow the "square root" of $U$
to have larger dimension than $U$. For example,

$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & -1 & 0 \end{pmatrix}^2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{pmatrix}$$

contains the 1-qubit phase flip as a $2 \times 2$ submatrix. This is an instance of a well-known geometrical fact,
that a mirror reversal in $n$ dimensions can be accomplished by a rotation in $n + 1$ dimensions. Indeed, any
$n \times n$ orthogonal matrix $U$ has a real square root of dimension $(n+1) \times (n+1)$, since there exists an element
of $SO(n+1)$ that contains $U$ as a submatrix. With either choice, the price we pay is that our $n$-dimensional
theory can be fully described only in $n + 1$ dimensions. But the $(n+1)$-dimensional theory requires $n + 2$
dimensions to describe, and so on ad infinitum, unless we declare that the $(n+1)$-st dimension is physically
different from dimensions 1 to $n$.

^11To write $U$ exactly as $V^k$, we'd need "a $k$-th root property," which also holds for quantum mechanics but fails for real
quantum mechanics.

^12Indeed some, such as $-1$, have infinitely many square roots.
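The $3 \times 3$ example can be verified in a few lines. The sketch below (helper names are mine) checks that the matrix is a rotation, that is, orthogonal with determinant 1, and that its square contains the phase flip in its upper-left $2 \times 2$ block.

```python
def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(A):
    return [list(col) for col in zip(*A)]


def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


V = [[1, 0, 0],
     [0, 0, 1],
     [0, -1, 0]]                 # quarter-turn about the first axis

V_squared = matmul(V, V)
upper_left = [row[:2] for row in V_squared[:2]]

print(V_squared)
print(det3(V), matmul(V, transpose(V)))
print(upper_left)
```

The extra dimension is what absorbs the sign: the full $3 \times 3$ square has determinant $+1$, even though its $2 \times 2$ block has determinant $-1$.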